Unsupervised Resolution of Objects and Relations on the Web
Authors
Abstract
The task of identifying synonymous relations and objects, or Synonym Resolution (SR), is critical for high-quality information extraction. The bulk of previous SR work assumed strong domain knowledge or hand-tagged training examples. This paper investigates SR in the context of unsupervised information extraction, where neither is available. The paper presents a scalable, fully implemented system for SR that runs in O(KN log N) time, where N is the number of extractions and K is the maximum number of synonyms per word. The system, called RESOLVER, introduces a probabilistic relational model for predicting whether two strings are co-referential based on the similarity of the assertions containing them. Given two million assertions extracted from the Web, RESOLVER resolves objects with 78% precision and an estimated 68% recall and resolves relations with 90% precision and 35% recall.
Similar resources
Unsupervised Methods for Determining Object and Relation Synonyms on the Web
The task of identifying synonymous relations and objects, or synonym resolution, is critical for high-quality information extraction. This paper investigates synonym resolution in the context of unsupervised information extraction, where neither hand-tagged training examples nor domain knowledge is available. The paper presents a scalable, fully implemented system that runs in O(KN log N) time i...
Comparative Evaluation of the Efficiency of the Metadata Structure of Digital Identifier Systems
The main solution to the problems of persistence and uniqueness in identifying digital objects in a web environment is to use digital identifiers instead of URLs. This solution rests on the resolution mechanism used in digital identifier systems. Resolution is the use of indirect names instead of URLs; what worked for the DNS (Domain Name System) in stabilizing i...
Object-Oriented Method for Automatic Extraction of Road from High Resolution Satellite Images
As the information carried in a high spatial resolution image is not represented by single pixels but by meaningful image objects, which include the association of multiple pixels and their mutual relations, the object based method has become one of the most commonly used strategies for the processing of high resolution imagery. This processing comprises two fundamental and critical steps towar...
Kohonen Self Organizing for Automatic Identification of Cartographic Objects
Automatic identification and localization of cartographic objects in aerial and satellite images have gained increasing attention in recent years in digital photogrammetry and remote sensing. Although the automatic extraction of man-made objects is in essence still an unresolved issue, man-made objects can be extracted from aerial photos and satellite images. Recently, the high-resolution s...
Modeling Fuzzy Three-Dimensional Topological Relations in a GIS Environment
Nowadays, geospatial information systems (GIS) are widely used to solve different spatial problems based on various types of fundamental data: spatial, temporal, attribute and topological relations. Topological relations are the most important part of GIS which distinguish it from the other kinds of information technologies. One of the important mechanisms for representing topological relations...
Publication date: 2007